On Estimating the Maximum Domination Value and the Skyline Cardinality of Multi-Dimensional Data Sets
Abstract
In recent years there has been increasing interest in query processing techniques that use the dominance relationship between items to select the most promising ones, based on user preferences. Skyline and top-k dominating queries are examples of such techniques. A skyline query computes the items that are not dominated by any other item, whereas a top-k dominating query returns the k items with the highest domination score. To enable query optimization, it is important to estimate both the expected number of skyline items and the maximum domination value of an item. In this article, we provide an estimation of the maximum domination value under the distinct values and attribute independence assumptions. We present three different methodologies for estimating and calculating the maximum domination value, and we test their performance and accuracy. Among the proposed estimation methods, our method Estimation with Roots outperforms all others and returns the most accurate results. We also introduce the eliminating dimension, i.e., the dimension beyond which all domination values become zero, and we provide an efficient estimation of that dimension. Moreover, we provide an accurate estimation of the skyline cardinality of a data set.
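The definitions in the abstract can be illustrated with a minimal brute-force sketch. It is not the estimation method of the article (Estimation with Roots is not specified here); it only computes the exact quantities that the article sets out to estimate. The function names, the sample points, and the convention that smaller values are preferred are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when it
    is no worse in all dimensions and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Items that are not dominated by any other item (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def domination_scores(points: List[Point]) -> List[int]:
    """For each item, the number of items it dominates."""
    return [sum(dominates(p, q) for q in points) for p in points]


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """The k items with the highest domination score, with their scores."""
    ranked = sorted(zip(points, domination_scores(points)),
                    key=lambda pair: -pair[1])
    return ranked[:k]


# Hypothetical two-dimensional sample data.
points: List[Point] = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]

sky = skyline(points)                        # skyline items
scores = domination_scores(points)           # one score per item
max_domination_value = max(scores)           # quantity the article estimates
best = top_k_dominating(points, 1)           # top-1 dominating item
```

On this sample the skyline cardinality is 3 and the maximum domination value is 2, attained by the point (2, 2). The quadratic scan is only practical for small inputs, which is one reason estimates of both quantities are useful to a query optimizer.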
Related resources
Estimation of the Maximum Domination Value in Multi-dimensional Data Sets
In recent years there has been increasing interest in query processing techniques that use the dominance relationship between objects to select the most promising ones, based on user preferences. Skyline and top-k dominating queries are examples of such techniques. A skyline query computes the objects that are not dominated, whereas a top-k dominating query returns the k object...
Skyline Operator on Anti-correlated Distributions
Finding the skyline in a multi-dimensional space is relevant to a wide range of applications. The skyline operator over a set of d-dimensional points selects the points that are not dominated by any other point on all dimensions. Therefore, it provides a minimal set of candidates for the users to make their personal trade-off among all optimal solutions. The existing algorithms establish both t...
Link-based Ranking of Skyline Result Sets
Skyline query processing has received considerable attention in the recent past. The skyline query is mainly used to find the set of non-dominated data points in a multi-dimensional dataset. One of the major drawbacks of the skyline operator is the high cardinality of the result set. Providing the most interesting points of the skyline set (top-k) inherently involves the ranking of the skyline p...
On the super domination number of graphs
The open neighborhood of a vertex $v$ of a graph $G$ is the set $N(v)$ consisting of all vertices adjacent to $v$ in $G$. For $D\subseteq V(G)$, we define $\overline{D}=V(G)\setminus D$. A set $D\subseteq V(G)$ is called a super dominating set of $G$ if for every vertex $u\in \overline{D}$, there exists $v\in D$ such that $N(v)\cap \overline{D}=\{u\}$. The super domination number of $G$ is the minimum car...
Effective Skyline Cardinality Estimation on Data Streams
In order to incorporate the skyline operator into the data stream engine, we need to address the problem of skyline cardinality estimation, which is very important for extending the query optimizer’s cost model to accommodate skyline queries. In this paper, we propose robust approaches for estimating the skyline cardinality over sliding windows in the stream environment. We first design an appr...
Journal title:
- IJKBO
Volume 3
Publication date: 2013